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Abstract 

We construct a priori error estimation for the force error of the twin-range cutoff method, which 
is widely used to treat the short-range non-bonded interactions in molecular simulations. Based on 
the error and cost estimation, we develop a work flow that can automatically determine the nearly 
i-C " most efficient twin-range cutoff parameters (i.e. the cutoff radii and the neighbor list updating 

q_j frequency) prior to a simulation for a predetermined accuracy. Both the error estimate and the 

parameter tuning method are demonstrated to be effective by testing simulations of the standard 
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Lennard-Jones 6-f2 fluid in gas, liquid as well as supercritical state. We recommend the tuned 
twin-range cutoff method that can save precious user time and computational resources. 
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I. INTRODUCTION 



Non-bonded interactions are encountered in nearly every molecular simulation, but their 
calculations are computationally expensive. It is thus important to develop computational 
methods that boost both efficiency and accuracy at the same time. Non-bonded interactions 
are mainly formed by a pairwise interaction u(r), where r is the distance between two 
interacting particles. There are two types of pairwise interactions, namely the long-range 
interaction and the short-range interaction, depending on the rate at which u(r) decay with 
respect to the distance. 

Most short-range interactions satisfy \u(r)\ < CV~ m , m > 3, which guarantees an absolute 
convergence of the energy. A naive idea to treat short-range interactions is to explicitly 
calculate and sum all pairwise interactions. This results in a computational cost scaling 
0(N 2 ) per time step, which rapidly becomes inefficient as the number of particles N grows 
large. A better way is to introduce a cutoff radius, outside of which all pairwise interactions 
are simply neglected. In combination with the cell list and the neighbor list algorithms [1], 
the total computational cost of the short-range interactions can be reduced to an acceptable 
level oiO{N). 

It is well known that various physical properties show significant dependence on the 
cutoff radius and the method to treat the discontinuity at the cutoff, for example, the phase 
diagram of the Lennard- Jones fluid [2-4], the density profile of the liquid- vapor interface 
[5, 6], the surface tension [3, 7, 8] and the free energy calculation [4]. Ill-chosen cutoff can 
lead to undesirable artifacts, for example, the phase diagrams of the Lennard- Jones fluid 
have been demonstrated to be substantially different with different choices of cutoff radii 
[2]. A straightforward way to eliminate the cutoff effects is to use an extremely large cutoff 
radius such that all properties of interest satisfactorily converge. This has motivated the 
adoption of a non-truncated potential (i.e. the cutoff radius is the same as half of the 
simulation box) to study the critical point phenomena [9-12]. However, such a long cutoff 
radius drastically increases the computational cost, thereby prohibiting long time and large 
size molecular simulations. An alternative way is to use a small cutoff while applying long 
range correction (LRC) [13] to eliminate the systematic error of the potential energy, the 
pressure and the free energy. The idea is to integrate the thermodynamic properties from the 
cutoff radius to infinity, assuming the radial distribution function is equal to 1. Some more 
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sophisticated methods have been developed in recent years to improve LRC for the constant 
pressure simulation [14], the inhomogeneous systems [15] and the free energy calculation of 
systems containing macromolecules [16]. 

A promising way to quantitatively analyze the undesirable cutoff effects is to express 
these artifacts in terms of the difference between the cutoffed interaction and the exact 
interaction. There are several error analysis works on the long-range interaction algorithms 
[17-21], while the analysis on the short-range interaction is scarce. In this paper, we develop 
the error estimate of the short-range force introduced by the widely used twin-range cutoff 
method (which is reduce the commonly used single-range cutoff method by a special choice 
of parameters) in homogeneous systems. Furthermore, to automatically determine the most 
efficient cutoff radii and neighbor list updating frequency, a work flow optimizing these 
working parameters with respect to speed and accuracy is proposed and tested. 

II. THEORY 

A. The twin range cutoff method 

The twin-range cutoff method introduces two cutoffs, denoted by r\ and r 2 , with r\ < r 2 . 
Assuming the neighbor list updating frequency is M , then at every M steps, the neighbor 
list is built for all neighboring particles that fall in the short cutoff radius ri and the cor- 
responding interactions are calculated and applied every next M — 1 steps. The neighbors 
fall in between the short cutoff and the long cutoff are assumed to move slowly, so the in- 
teractions can be calculated less frequently. At every M steps, the interactions with these 
particles are calculated and stored. In the subsequent M — 1 steps, they are applied to the 
corresponding particles without any change. When r\ = r 2 , the twin-range cutoff method 
reduces to the single-range cutoff method. 

B. The error estimate of the twin-range cutoff method 

For simplicity, we will only study the error estimate of single component systems, because 
it is not difficult to derive the error estimate for multicomponent systems in the same way. 
We consider a reference particle i and denote the sets of neighbors fall in r±, between r\ 
and r 2 , and out of r 2 by Q\, Q l 2 and Q l 3 , respectively. The position of any particle j is 
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denoted by rj at the step when the neighbor list is built. In the following several steps, the 
accumulated absolute displacement is denoted by dj. Then the exact force F* on particles 
i and the cutoffed force F i by the twin-range cutoff method are 

F* = F(r tJ + dy) + F(ry + dy) + F(r tj + dy), (1) 

jeoi jgn^ je^ 

= ]T F(ry + dy) + F( rij ). (2) 

Therein the relative position is ry = r« — rj and the relative displacement is dy = dj — dj. 
The difference between the exact force and the cutoffed force is therefore 

AF = F* - Ft 

= + d ^ - F ^)\ + E F ^ + d ^ 

« ^ VF(ry) ■ dy + ^ [f (r y ) + V-F(ry) • dy]. (3) 

The last approximation holds by Taylor expansions, so it is important to keep in mind 
that (3) is only valid when |dy| is small. To estimate the error of the force, it is crucial to 
provide a proper definition of the word "error" first. We adopt the widely used root mean 
squared (RMS) force error S(ri,r2, M) = a/ (\AF\ 2 ), where the (•) is the average over all 
positions i — 1, • ■ • , N and all displacements dj, i — 1, • • ■ , N. 
To calculate this error we need some assumptions: 

1. i — 1, • • ■ , N are random variables with uniform distributions over the space. 

2. dj, 2 = 1, ■ ■ ■ ,N are random variables having a normal distribution with mean and 
variance d 2 , namely jV(0, d 2 ). 

3. r*j and rj are independent, if i ^ j. 

4. dj and dj are independent, if z ^ j. 

5. r*j and dj are independent, for any i and j. 

The standard deviation d of the random variable dj can be approximately related to the 
neighbor list updating frequency M by 



d^MAul^^, (4) 
m 
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where is the Boltzmann constant, and m is the mass of the particles. By the theorem 
of equipartition, 3fcsT/m is the mean squared velocity. This relation is only valid when d 
is smaller than the mean free path. In all the test simulations studied in this paper, the 



largest deviation of d from MAt-y/3A;^T/m is less than 10%. All the above assumptions are 
ideal cases that facilitate the derivation of the error estimate. However, in systems studied 
in real problems, these assumptions can be violated. In section III, we will show when the 
real force error deviates from our theoretical estimate and to what extent the theoretical 
estimate is reliable. 

Based on the aforementioned assumptions, we reach the error estimate: 

f /*°° ^ /*oo 2 

£ 2 (r 1 ,r 2 ,M) =(4vrp)W / -r 2 u"(r)dr + / -ru'(r)dr 

I Jri 3 J ri 3 

{/*oo -r poo 9 

/ -[ru"(r)} 2 dr+ / -\u\r)] 2 di 
Jri 3 J ri 3 

POO 

4%p / [ru'(r)] 2 dr. (5) 

J r% 

It is straightforward to develop the error estimate of the short-range interaction that has 
the form uir) = 4e(a/r) m and m > 3: 

lfi /I 

£ 2 {r u r 2 ,M) = (— nmpea m ) 2 d 2 ' 



2m-2 



3 ' ' \r( 
2^777; TV" np{8mea m ) 2 d 2 1 



3(2m + 1) ' K 1 V^i 2m+1 

npiAme^ff^-,). (6) 



2m - 1 



C. Tuning the working parameters for the twin-range cutoff method 

In general, the calculation of the cutoffed short-range interaction with the neighbor list 
method is performed in two steps: 1, generate the neighbor list at every M steps. 2, 
calculate the interactions by using the neighbor list in the subsequent M — 1 steps. We use 
the following formulas to estimate the computational costs of the force calculation (Tp) and 
the neighbor list generation (T/v): 

T F = Cl rl + dx, (7) 
T N = c 2 rl + d 2 . (8) 
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Where ci, d±, c 2 and d 2 are constants depending on the system being studied, the software 
implementation, as well as the hardware architecture. For a good estimate of these constants, 
we time a series of short test runs with different cutoff radii and fit the computational costs 
in the least square sense. The average computational cost of the short-range interaction per 
step is 

T = T F + ±T N . (9) 

Provided with both the error and the computational cost estimates for the twin-range cut- 
off method, it is possible to design a routine that determines the most efficient combination 
of parameters r 1; r 2 and M. Here, the phrase "most efficient" refers to a set of parameters 
that reaches a given accuracy at minimal computational cost. From a mathematical point 
of view, this is a constrained optimization problem that can be written in the following way: 

min T{r l ,r 2 ,M), (10) 
s.t. £(n,r 2 ,M) = £* and (11) 
d(M)<d . (12) 

Where S* defines the required accuracy. Constraint (12) is added because the Taylor ex- 
pansions in (3) are good approximations only when d is small. In the multicomponent sys- 
tems, the displacement should be constrained for the lightest particles. We find do = 0.2a 
will give reasonable results. Since M can be analytically solved by constraint (11), saying 
M = M(r 1 ,r 2 ,£*), (10) - (12) is reduced to 

min T(n,r 2 ,M(ri,r 2 ,S*)), (13) 
s.t. d(M(n,r 2 ,E*)) <d (14) 

The constrained optimization problem (13) and (14) can be solved by standard optimization 
algorithms. 

III. RESULTS OF TESTING SIMULATIONS 

We ran molecular dynamics (MD) simulations on an Intel Xeon E5520 Processor with 
Gromacs 4.0.7 compiled by GCC 4.5. The testing systems used in our studies contained 
16,000 particles interacting via the standard Lennard- Jones 6-12 interaction. The MD time 
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/ L J 


£* [s/a] 


r\ [a] 


d [a] 


M 


£real [ £ / a ] 


W 4 T est [s] 


W 4 T real [s] 




10~ 2 


3.23 


0.200 


52 


1.10 x 10~ 2 


21.3 


19.7 


0.05 


10~ 3 


4.90 


0.200 


52 


1.09 x 10~ 3 


37.0 


36.8 




10" 4 


7.48 


0.200 


52 


1.12 x 10~ 4 


92.9 


93.1 




10~ 2 


3.91 


0.180 


44 


1.17 x 10~ 2 


85.1 


88.0 


0.30 


10~ 3 


5.87 


0.126 


31 


1.43 x 10~ 3 


266 


268 




10" 4 


8.91 


0.100 


25 


1.86 x 10~ 4 


914 
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10~ 2 


4.28 


0.110 


29 


0.69 x 10~ 2 


273 


251 


0.80 


10^ 3 


6.45 


0.081 


21 


0.64 x 10~ 3 


900 


873 




10" 4 


9.81 


0.067 


17 


0.61 x 10~ 4 


3168 


3120 



TABLE I: The tuned parameters of the single-range cutoff method. System in the gas (p = 0.05cr 3 , 
T = 1.20e/k B ), supercritical (p = 0.30cj- 3 , T = lMe/k B ) as well as liquid state (p = 0.80cr~ 3 , 
T = 1.20e/kB) were tested. £* is the target accuracy while the £ rea i is the real error calculated from 
the simulation. T es t is the computational expense estimated by (9). T rea i is the real computational 
cost timed in the test simulations. The unit of the computational costs is second per step. 

step was At = 0.002 r, where r = o\Jmje. NVT simulations were performed by coupling 
the systems to the Nose- Hoover thermostat with a relaxation time 1 r. Short test runs of 
10,000 steps, r\ = T2 — u and M = 20 were performed at different r2 to provide an estimate 
of the constants in the computational cost expressions (7) and (8). This process lasted for 
about an hour. It is worthwhile to spend this time because it is short comparing with the 
time costs of real simulations. Moreover, the constants can be reused in all simulations on 
the same machine with the same density. The error estimate (6) (m = 6 for Lennard- Jones 
interaction) was studied by the systems in the gas (T = 1.20e/kB, p = 0.05cr -3 ), liquid 
(T = 1.20e/k B , P = 0.80a~ 3 ) and supercritical state (T = lMe/k B , p = 0.30a' 3 ). Target 
precisions 10~ 2 e/a, 10~ 3 e/a and 10~ 4 e/a were tested. To measure the real accuracies, the 
cutoff radii equal to half the simulation boxes were employed and the resulting forces served 
as the exact forces. The tuned parameters, accuracies and computational costs of the single- 
and twin-range cutoff methods are listed in Tables I and II. 

In all cases presented, the constraint (11) is strictly satisfied, so the deviation of the real 
errors from the target errors measures the quality of the error estimates. In the gas state, 
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TABLE II: The tuned parameters of the twin-range cutoff method. System in the gas (p = 0.05cr -3 , 
T = 1.20e/ Isb), supercritical (p = 0.30a -3 , T = 1.34£//cb) as well as liquid state (p = 0.80cr~ 3 , 
T = 1.20e/kB) were tested. £* is the target accuracy while the £ rea i is the real error calculated from 
the simulation. T es t is the computational expense estimated by (9). T rea i is the real computational 
cost timed in the test simulations. The unit of the computational costs is second per step. The 
last column gives the acceleration ratios of the real computational costs with respect to the real 
costs of the single-range cutoff method. 

the error estimates are sharp. In the supercritical and liquid states, the error estimates do 
not deviate very far from the real errors, and all of them fall in the range [\£ r eah 2£reai\- In 
the supercritical cases, the errors tend to be underestimated, while in the liquid systems, the 
errors are somehow overestimated. The estimated RMS errors are not exactly the real values, 
because some of the assumptions in section II B are not preserved in real simulations. The 
assumptions 3, 4 and 5 are obviously not satisfied because any two particles cannot overlap 
due to the repulsive core of the Lennard- Jones interaction. Moreover, they are also violated 
by the correlations between particles due to the attractive dispersion term. It is possible 
to include the pair distribution information (i.e. the radial distribution function g(r) in the 
homogeneous systems) to improve the quality of the error estimates, but the estimates are 
then posterior rather than prior, which is not convenient for the parameter tuning. 

From Table I and II, it is obvious that higher target accuracies and larger system densities 
require more intensive parameters, i.e. larger cutoff radii and higher neighbor list updating 
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frequencies. In most g the displacement d hits the constraint (12). In all cases 

shown, the M are larger than the usually used value (M = 10, default in Gromacs). In 
the gas state, the M can be as large as 52. The twin-range cutoff method is always more 
efficient than the single-range cutoff method. However, in the liquid state, the twin-range 
cutoff method is a little bit less accurate than the single-range cutoff method. So the benefit 
of using twin-range cutoff should be discounted in the sense of the same accuracy. 

To test the benefit we gain by the parameter tuning, we used a set of unoptimized single- 
range parameters (r = 4.00a and M = 10) in the gas state (p = 0.05a~ 3 , T = 1.20s /ks)- 
The accuracy and efficiency of these parameters are £ = 3.01 x 10~ 3 s/a and T = 4.10x 10~ 3 s, 
respectively. To compare the efficiencies, we manually adjusted the target precisions £* so 
that the real accuracy of the tuned parameters is nearly the same as the corresponding 
unoptimized parameters. The tuned single-range parameters are r = 4.09a and M = 52 with 
computational cost T = 2.71 x 10~ 3 s. And the tuned twin-range parameters are r\ = 3.54cr, 
r 2 = 4.68a and M = 52 with T = 2.56 x 10~ 3 s. The tuned single- and twin-range parameters 
save 33% and 38% computational costs, respectively. The same unoptimized parameters 
result in £ = 6.10 x 10~ 3 s/a and T = 2.7 '4 x 10~ 2 s in the liquid state (p = 0.80a~ 3 , 
T = 1.20s/kB). At the same accuracy level, the tuned single-range parameters are r = 4.36a 
and M = 28 with the computational cost T = 2.76 x 10~ 2 s. And the tuned twin-range 
parameters are 7*1 = 3.94a, r 2 = 4.86a and M = 19 with T = 2.76 x 10~ 2 s. In this case, the 
tuned parameters do not improve the efficiency at all, because the unoptimized parameters 
are good enough. Moreover, since the error estimate (6) is not sharp in the liquid state and 
the constants c\, d\, C2 and c?2 in (7) and (8) are not exactly measured, the tuned parameters 
are only nearly optimal rather than really optimal. So the computational costs of the tuned 
parameters can be trivially more expensive than the unoptimized ones. We also tested a 
set of unoptimized twin-range parameters that are taken from one of our former simulation 
settings: r\ = 2.5a, r 2 = 4.0a and M = 10. In the gas state, the error is £ = 4.48 x 10~ 3 £/a 
with T = 3.87 x 10 _3 s. The tuned parameters are r\ = 3.28a, r 2 = 4.31a and M = 52 with 
the computational cost of T = 2.29 x 10~ 3 s, 40% cheaper. While in the liquid state, the 
accuracy and computational cost of the unoptimized parameters are £ = 2.91 x 10~ 2 e/a 
and T = 1.85 x 10 _2 s, respectively. The tuned parameters are r\ = 3.12a, r 2 = 3.81a and 
M = 27 with the computational cost of T = 1.37 x 10 _2 s, 26% faster than the unoptimized 
parameters. 
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IV. CONCLUSIONS AND OPEN QUESTIONS 



In this paper, an error estimate of the twin-range cutoff method was developed. The er- 
ror estimate of the single-range cutoff method was easily derived from the twin-range cutoff 
method, because the former is only a special case of the later. Equipped with both the error 
and the computational cost estimates, we proposed a work flow that can automatically de- 
termine the nearly optimal parameters demanding a certain target accuracy. We verified the 
effectiveness of the error estimate and parameter tuning algorithm by numerical simulations 
of Lennard- Jones 6-12 system in gas, liquid and supercritical state. We also presented some 
examples to show, by parameter tuning, to what extent the computational expense can be 
saved with respect to the unoptimized parameters. 

In the gas state, the error estimates are sharp. In the liquid and supercritical state, the 
error estimates are reasonable: they are always larger than half and smaller than twice of 
the real errors. The quality of the error estimates depends on the extent to which the as- 
sumptions in section II B are satisfied. These estimated force errors provide a quantitative 
description of the undesirable cutoff effects introduced by the single- and twin-range cutoff 
methods. The parameter tuning algorithm enables an automatic searching of the working 
parameters (i.e. the cutoff radii and the neighbor list updating frequency) that reach a 
predetermined accuracy at nearly minimal computational cost. Comparing with the unop- 
timized parameters, the tuned parameters are always faster. The benefit ranged from 0% 
to 40%, depending on the quality of the unoptimized parameters. We also demonstrated 
that the optimized twin-range cutoff method is faster than the optimized single-range cutoff 
method, so the former is recommended. Combining with the parameter tuning algorithm 
developed for the long-range interactions [21], all non-bonded interactions can be calculated 
at a nearly optimal efficiency. 

The error estimate and the parameter tuning for inhomogeneous systems were not consid- 
ered by the present paper, because the assumptions for the error estimate are worse violated. 
However, how to handle the inhomogeneous systems is a very important open question in 
this field, which requires further studies. 
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